Boosting Lite - Handling Larger Datasets and Slower Base Classifiers
Authors
Abstract
In this paper, we examine ensemble algorithms (Boosting Lite and Ivoting) that provide accuracy approximating that of a single classifier but require significantly fewer training examples. Such algorithms allow ensemble methods to operate on very large data sets or to use very slow learning algorithms. Boosting Lite is compared with Ivoting, standard boosting, and building a single classifier. Comparisons are done on 11 data sets to which other approaches have been applied. We find that ensembles of support vector machines can attain higher accuracy with less data than ensembles of decision trees. We find that Ivoting may produce more accurate ensembles on some data sets; however, Boosting Lite is generally able to indicate when boosting will increase overall accuracy.
Keywords: classifier ensembles, boosting, support vector machines, decision trees
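For context, the idea the two algorithms share is to train each ensemble member on a small, importance-weighted "bite" of the data rather than the full training set. The following is a minimal Python sketch of that idea only, assuming binary 0/1 labels and decision-tree base learners; the bite size, the upweighting factor, and the function names are illustrative, not the published Boosting Lite or Ivoting procedures (Ivoting, for instance, selects bites using out-of-bag error estimates).

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def small_sample_ensemble(X, y, n_rounds=10, bite_size=200, seed=0):
    # Illustrative only: each round trains a base classifier on a small
    # "bite" sampled in proportion to example weights, then upweights
    # examples the current ensemble still misclassifies.
    rng = np.random.default_rng(seed)
    n = len(y)
    weights = np.full(n, 1.0 / n)  # sampling distribution over examples
    ensemble = []
    for _ in range(n_rounds):
        idx = rng.choice(n, size=bite_size, replace=True, p=weights)
        ensemble.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
        votes = np.mean([c.predict(X) for c in ensemble], axis=0)
        wrong = votes.round() != y  # assumes binary 0/1 labels
        weights = np.where(wrong, weights * 2.0, weights)  # factor is arbitrary
        weights /= weights.sum()
    return ensemble

def vote(ensemble, X):
    # Unweighted majority vote over the ensemble members.
    return (np.mean([c.predict(X) for c in ensemble], axis=0) >= 0.5).astype(int)
```

Note that each base learner only ever sees bite_size examples, which is what lets such methods scale to very large data sets or very slow base learners.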
Similar Resources
Bagging and Boosting with Dynamic Integration of Classifiers
One approach in classification tasks is to use machine learning techniques to derive classifiers from learning instances. The cooperation of several base classifiers as a decision committee has succeeded in reducing classification error. The main current decision committee learning approaches, boosting and bagging, use resampling with the training set and can be used with different machine le...
VipBoost: A More Accurate Boosting Algorithm
Boosting is a well-known method for improving the accuracy of many learning algorithms. In this paper, we propose a novel boosting algorithm, VipBoost (voting on boosting classifications from imputed learning sets), which first generates multiple incomplete datasets from the original dataset by randomly removing a small percentage of observed attribute values, then uses an imputer to fill in th...
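A hedged sketch of the perturb-impute-vote scheme described above, assuming numeric features; the removal fraction, the mean imputer, and the function names are stand-ins, since the paper's actual imputer and boosting setup are not given here.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.impute import SimpleImputer

def vipboost_style_members(X, y, n_sets=5, remove_frac=0.05, seed=0):
    # Illustrative only: perturb the data by deleting a small fraction of
    # attribute values, impute the holes, and boost on each imputed copy.
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_sets):
        Xi = np.array(X, dtype=float)                          # fresh copy per set
        Xi[rng.random(Xi.shape) < remove_frac] = np.nan        # randomly remove values
        Xi = SimpleImputer(strategy="mean").fit_transform(Xi)  # placeholder imputer
        members.append(AdaBoostClassifier().fit(Xi, y))
    return members

def majority_vote(members, X):
    # Final prediction is a vote across the boosted members.
    return (np.mean([m.predict(X) for m in members], axis=0) >= 0.5).astype(int)
```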
Transforming examples for multiclass boosting
AdaBoost.M2 and AdaBoost.MH are boosting algorithms for learning from multiclass datasets. They have received less attention than other boosting algorithms because they require base classifiers that can handle the pseudoloss or Hamming loss, respectively. The difficulty with these loss functions is that each example is associated with k weights, where k is the number of classes. We address this...
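To make the "k weights per example" point concrete, the sketch below shows the AdaBoost.MH-style bookkeeping: a weight matrix over (example, label) pairs and the weighted Hamming loss a base classifier must minimize under it. The function name and the {-1, +1} encoding are one conventional choice, not code from the paper.

```python
import numpy as np

def weighted_hamming_loss(D, H, Y):
    # D: (n, k) nonnegative weights over (example, label) pairs, summing to 1.
    # H: (n, k) base-classifier predictions in {-1, +1}.
    # Y: (n, k) targets in {-1, +1}, with +1 only at each example's true class.
    # The loss is the total weight of the (example, label) pairs the base
    # classifier gets wrong -- the quantity AdaBoost.MH asks it to minimize.
    return float(np.sum(D * (H != Y)))

# Each of the n examples carries k weights, one per class label:
n, k = 3, 4
D = np.full((n, k), 1.0 / (n * k))  # uniform initial weight matrix
```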
An Application of Oversampling, Undersampling, Bagging and Boosting in Handling Imbalanced Datasets
Most classifiers work well when the class distribution in the response variable of the dataset is well balanced. Problems arise when the dataset is imbalanced. This paper applied four methods: Oversampling, Undersampling, Bagging and Boosting in handling imbalanced datasets. The cardiac surgery dataset has a binary response variable (1=Died, 0=Alive). The sample size is 4976 cases with 4.2% (Di...
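A minimal sketch of the two resampling methods named above, assuming a binary 0/1 response like the cardiac surgery outcome; the function name and the balance-to-equal-class-sizes choice are illustrative (bagging and boosting would come from standard ensemble implementations).

```python
import numpy as np

def random_resample(X, y, minority=1, oversample=True, seed=0):
    # Illustrative only: balance a binary 0/1 target (e.g. 1=Died, 0=Alive)
    # by duplicating minority cases (oversampling) or discarding majority
    # cases (undersampling) until both classes are the same size.
    rng = np.random.default_rng(seed)
    min_idx = np.flatnonzero(y == minority)
    maj_idx = np.flatnonzero(y != minority)
    if oversample:
        extra = rng.choice(min_idx, size=len(maj_idx) - len(min_idx), replace=True)
        keep = np.concatenate([maj_idx, min_idx, extra])
    else:
        keep = np.concatenate([min_idx, rng.choice(maj_idx, size=len(min_idx), replace=False)])
    keep = rng.permutation(keep)
    return X[keep], y[keep]
```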
Combining Bagging and Boosting
Bagging and boosting are among the most popular resampling ensemble methods that generate and combine a diversity of classifiers using the same learning algorithm for the base classifiers. Boosting algorithms are considered stronger than bagging on noise-free data. However, there are strong empirical indications that bagging is much more robust than boosting in noisy settings. For this reason, i...